Abstract: The forest stand database of Bilahe Forestry Bureau, Inner Mongolia, China, was taken as an example to demonstrate the 
whole process of building a temporal geodatabase by means of reengineering. The process consisted of establishing a conceptual data 
model from the initial database, constructing a logical database by means of mapping, and building a temporal geodatabase with the help 
of a Computer-Aided Software Engineering (CASE) tool and the Unified Modeling Language (UML). The results showed that, because the 
reengineered forest stand geodatabase is dynamic, it can easily store historical data and answer time-related questions through Structured 
Query Language (SQL), while maintaining the integrity of the database and eliminating redundancy. 
Introduction 

Organizations are turning to system reengineering as a means of 
upgrading their existing information systems in situations where 
it appears to be a less expensive alternative to system replacement. 
Re-engineering can be defined as the process of discovering 
how a system works. It requires identifying and understanding 
all components of an existing system and the relationships 
between them (Alhajj 2003). Furthermore, if the owner of an 
existing database wants to maintain or adjust the database design, 
reengineering is necessary to semantically enrich and document 
the database, so that the huge amounts of data stored in the 
legacy database need not be thrown away (Alhajj 2003; de Guzman 
et al. 2006). 

The re-engineering process derives the conceptual schema 
from the existing database. The objective is to recover as much 
of the necessary information as possible about the conceptual 
model that led to the legacy database, that is, to move from the 
legacy database up to its higher-level conceptual data model. 
There is no universal agreement on how to carry out database 
re-engineering. Although methods vary from person to person, they 
all follow the guidelines of reducing redundancy, eliminating 
null values and maintaining the integrity of the database (Cohen and 
Feldman 2003). 

More and more organizations prefer real-world spatial databases 
that offer accountability and traceability. These requirements 
lead to the replacement of the usual ‘update-in-place’ policy by 
an ‘append-only’ policy that retains all previous records in the 
database, namely, a transaction-time policy (Sotnykova et al. 2005; 
Zhao et al. 2004). Various temporal data models, with corresponding 
temporal data dependencies and temporal normal forms, have 
been proposed. Although these temporal models are very diverse, 
they can be grouped into two types: tuple time-stamping (for 
example, Dey et al. 1996) and attribute time-stamping (for 
example, Liao et al. 1999). 
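
The difference between the two groups can be illustrated with a minimal Python sketch. It is not taken from the cited models: the class names, the field names (stand_id, volume_m3, recorded_year) and the values are hypothetical and serve only to show where the time stamp is attached under an append-only policy.

```python
from dataclasses import dataclass, field


# Tuple time-stamping: each version of a record is a whole row
# that carries its own time stamp.
@dataclass(frozen=True)
class StandRow:
    stand_id: int
    dominant_species: str
    volume_m3: float
    recorded_year: int  # time stamp attached to the whole tuple


class AppendOnlyTable:
    """Append-only store: an update adds a row, old rows are never overwritten."""

    def __init__(self):
        self.rows = []

    def record(self, row):
        self.rows.append(row)

    def as_of(self, stand_id, year):
        """Return the latest version recorded at or before `year`, or None."""
        versions = [
            r for r in self.rows
            if r.stand_id == stand_id and r.recorded_year <= year
        ]
        return max(versions, key=lambda r: r.recorded_year, default=None)


# Attribute time-stamping: one object per entity, and every attribute
# keeps its own history as a list of (year, value) pairs.
@dataclass
class StandWithHistories:
    stand_id: int
    dominant_species: list = field(default_factory=list)
    volume_m3: list = field(default_factory=list)


# Hypothetical data: the volume changes, the species does not.
table = AppendOnlyTable()
table.record(StandRow(1, "Larix gmelini", 120.0, 1985))
table.record(StandRow(1, "Larix gmelini", 95.0, 1995))

stand = StandWithHistories(1)
stand.dominant_species.append((1985, "Larix gmelini"))
stand.volume_m3.extend([(1985, 120.0), (1995, 95.0)])
```

In the tuple-stamped table the unchanged species name is repeated in every version, whereas the attribute-stamped object stores it once. The tuple-stamped form, however, maps directly onto ordinary relational rows.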

The importance of spatio-temporal databases is recognized not only 
by database engineers, but also by database users and practitioners. 

Problem identification and objectives of the study 

Bilahe Forestry Bureau, Inner Mongolia, China, is located in the 
Daxing’anling mountain area of northeast China. It covers 
an area of 47 000 ha, most of which is densely forested with three 
tree species, i.e., Larix gmelini, Quercus mongolica and Betula 
platyphylla. Since its foundation, three large-scale forest inventories 
have been conducted, in the 1980s, 1990s and 2000s, respectively. 
Along with the yearly logging plan, the Bureau carries out a 
complementary inventory every year. The management units are 
classified into three levels, i.e., forest stand, forest compartment 
and forest farm. Each forest farm has one or more forest 
compartments, each of which is composed of one or more forest stands. 
Forest inventory data are recorded for each forest stand and 
then summed up to the higher management units. 
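
The roll-up from stands to compartments and farms can be sketched as follows. This is only an illustration: the farm names, the identifiers and the area values are hypothetical and do not come from the Bilahe inventory.

```python
from collections import defaultdict

# Hypothetical stand-level records: (farm, compartment, stand, area in ha).
stands = [
    ("Farm A", 1, 1, 12.5),
    ("Farm A", 1, 2, 7.5),
    ("Farm A", 2, 1, 20.0),
    ("Farm B", 1, 1, 15.0),
]


def summarize(records):
    """Sum a stand-level quantity up to compartment level and farm level."""
    by_compartment = defaultdict(float)
    by_farm = defaultdict(float)
    for farm, compartment, _stand, area in records:
        # A compartment is identified only within its farm.
        by_compartment[(farm, compartment)] += area
        by_farm[farm] += area
    return dict(by_compartment), dict(by_farm)


compartment_area, farm_area = summarize(stands)
```

Because the higher-level totals can always be recomputed in this way, they are derived values and need not be stored alongside the stand records.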

Bilahe Forestry Bureau was mainly logging-oriented, and the 
natural forest landscape had been dramatically damaged by 
increasingly intensive human intervention. In early 1999, 
after the unprecedented flood of 1998 in China, the “natural forest 
restoration” project came into being, aiming to protect the natural 
forest of the Daxing’anling forest region. Since then, Bilahe 
Forestry Bureau has shifted its function from logging-oriented to 
ecology-oriented. Temporal information plays a crucial role in such a 
case. The current inventory data are loosely managed in Microsoft 
Excel. They consist of a group of 12 spreadsheets for each year, 
each of which covers a major topic relevant to forest 
stands (e.g. terrain, soil, regeneration, disease). The spreadsheets 
are linked by the unique identifier of the forest stand. Temporal 
information is maintained through separate spreadsheets of the same 
topic for different years. Although a forest stand map and a forest 
resources distribution map exist, they are hard copies rather than 
spatial data. Many problems with the current database have been 
identified, including 1) lack of data integrity, 2) lack of a 
rational database schema and 3) difficulty in retrieving temporal and 
spatial information. 

The decision-makers of Bilahe Forestry Bureau have long been 
hampered by the lack of a forest stand database that can provide 
the temporal and spatial information required for better forest 
resources management. The main objective of this study is to 
reengineer the current database to facilitate spatial and temporal 
data retrieval. The sub-objectives are 1) to establish a rational 
entity-relation model, 2) to reengineer the current database 
schema in accordance with the rules of the relational data model 
and 3) to build a temporal geodatabase in ArcGIS. 

Process of re-engineering 

The process of re-engineering consists of establishing a 
conceptual data model from the initial database, constructing a 
logical database by means of mapping, and building a temporal 
geodatabase with the help of a CASE tool and UML. 

Conceptual data model building by Entity-relation Diagram (ERD) 

The entity-relation (ER) model is one of the most commonly used tools 
for simplifying the real world, describing data as entities, 
associations and attributes (Chen and Lu 1997). There is no absolute 
rule for defining an entity. The most obvious characteristic 
of an entity is its comparatively independent existence. It 
is also advisable to take stability into consideration when defining 
an entity. The entities identified in this study were forest 
farm, compartment and forest stand, because these forest management 
units are comparatively independent of each other and 
comparatively stable over time. Attributes in the ER model can be 
simple or composite, single-valued or multi-valued, and stored or 
derived. In this study, attributes were generally taken from 
the original database after normalization, which eliminated 
derived attributes and separated out repeating groups 
(multi-valued attributes), partial functional dependencies and 
transitive functional dependencies. Furthermore, the cardinality 
of each association was determined based on the users’ view of the 
database, including the order of the relationship (1:1, 1:M or M:N) 
and the optionality (may or must). 

For the sake of simplicity, the tuple time-stamping strategy was 
adopted to represent time. Entities are stamped when 
they come into being (attribute “from”), which is a point event. 
For associations, the “belong” relationship is treated as an 
interval time event (for example, stand A belongs to compartment 
B from year X until year Y). A change of an association 
was regarded as a point time event (for example, stand A’s ID 
changed from 1 001 to 10 001 in year 2000). We also employed 
the idea of temporal dependency, which was derived from the 
stability of entities (Liu and Song 2000) and expressed by weak 
entities. A weak entity has no key attributes of its own but only a 
partial key, which is an attribute or a set of attributes that uniquely 
identifies the weak entity together with the primary key of its owner 
entity. Weak entities were identified in this case mainly because entities 
with different stability over time should be separated in order to 
maintain temporal dependency. For weak entities, we assumed 
that an inventory was a point time event, although an inventory 
takes time. Each weak entity has a set of attributes of the same 
stability. Finally, 10 entities, 5 associations and their relevant 
attributes were identified in this study (see Fig. 1; only time attributes 
are shown, for simplicity). 
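
The time-stamping choices described above can be sketched as a small relational schema, here created in an in-memory SQLite database from Python. The sketch is an assumption-laden illustration rather than the schema of Fig. 1: all table and column names are hypothetical, the attribute “from” is written as from_year because FROM is a reserved word in SQL, and the value 9999 is an assumed sentinel for an interval that is still open.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
-- Entities: stamped with the year they come into being (point event).
CREATE TABLE compartment (
    compartment_id INTEGER PRIMARY KEY,
    from_year      INTEGER NOT NULL
);
CREATE TABLE stand (
    stand_id  INTEGER PRIMARY KEY,
    from_year INTEGER NOT NULL
);
-- "belong" association: an interval event with a start and an end year.
CREATE TABLE belong (
    stand_id       INTEGER NOT NULL REFERENCES stand(stand_id),
    compartment_id INTEGER NOT NULL REFERENCES compartment(compartment_id),
    from_year      INTEGER NOT NULL,
    until_year     INTEGER NOT NULL DEFAULT 9999,  -- assumed "still valid" sentinel
    PRIMARY KEY (stand_id, from_year)
);
-- Weak entity: the partial key inventory_year identifies a row only
-- together with the key of the owner entity (stand_id).
CREATE TABLE stand_regeneration (
    stand_id         INTEGER NOT NULL REFERENCES stand(stand_id),
    inventory_year   INTEGER NOT NULL,
    seedlings_per_ha INTEGER,
    PRIMARY KEY (stand_id, inventory_year)
);
""")

# Hypothetical rows: one stand, one compartment, two inventories.
conn.execute("INSERT INTO compartment VALUES (10, 1980)")
conn.execute("INSERT INTO stand VALUES (1001, 1980)")
conn.execute("INSERT INTO belong VALUES (1001, 10, 1980, 9999)")
conn.execute("INSERT INTO stand_regeneration VALUES (1001, 1985, 400)")
conn.execute("INSERT INTO stand_regeneration VALUES (1001, 1995, 650)")
```

Each inventory adds a row to the weak-entity table instead of overwriting the previous one, so attributes of the same stability share one time stamp and the history of the stand is retained.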

Logical database building by mapping 

Mapping is the process of translating the ER conceptual data model 
into the relational data model (RM). The major steps in mapping 
entity types are 1) turning each entity type into a relation and 
choosing an appropriate key as the primary key of the relation and 2) 
creating a relation for each weak entity type as well. The only 
difference in treatment is that the relation for a weak entity type must 
also include the attributes that form the key of the owner entity type; 
the key of the relation is formed by these attributes together with the 
partial key. The mapping of association types depends largely on their 
cardinality. Basically, 1:1 and 1:N association types can be accommodated by 
adding attributes to the relation of one of the participating entity 
types. For an N:M association type R, posting a foreign 
key in either of the two participating entity types is not valid, and 
a new relation must be added to the database schema. 
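
The mapping rules above can be sketched as a short Python routine. It is a simplified illustration under stated assumptions, not the procedure used in the study: the classes Entity and Association, the function names and the example entities (including the N:M association with tree species) are hypothetical, key attribute names are assumed to be unique across entities, and the time attributes of associations are ignored.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    name: str
    key: list    # key attributes (the partial key for a weak entity)
    attrs: list  # non-key attributes
    owner: "Entity | None" = None  # set only for weak entities


@dataclass
class Association:
    name: str
    left: Entity
    right: Entity
    cardinality: str  # "1:1", "1:N" (left is the "1" side) or "N:M"


def full_key(entity):
    """Key of the relation: the owner's key (if weak) plus the entity's own key."""
    owner_key = full_key(entity.owner) if entity.owner else []
    return owner_key + entity.key


def map_to_relations(entities, associations):
    relations = {}
    # Rules 1 and 2: one relation per entity type, weak or not.
    for e in entities:
        pk = full_key(e)
        fks = [(full_key(e.owner), e.owner.name)] if e.owner else []
        relations[e.name] = {
            "columns": pk + e.attrs, "primary_key": pk, "foreign_keys": fks,
        }
    for a in associations:
        if a.cardinality in ("1:1", "1:N"):
            # Post the key of the "1" side as a foreign key in the other relation.
            fk = full_key(a.left)
            target = relations[a.right.name]
            target["columns"] += [c for c in fk if c not in target["columns"]]
            target["foreign_keys"].append((fk, a.left.name))
        elif a.cardinality == "N:M":
            # A foreign key on either side cannot hold many values: add a relation.
            pk = full_key(a.left) + full_key(a.right)
            relations[a.name] = {
                "columns": list(pk),
                "primary_key": pk,
                "foreign_keys": [(full_key(a.left), a.left.name),
                                 (full_key(a.right), a.right.name)],
            }
        else:
            raise ValueError(f"unsupported cardinality: {a.cardinality}")
    return relations


# Hypothetical example loosely following the management units of this study.
farm = Entity("farm", ["farm_id"], ["farm_name"])
compartment = Entity("compartment", ["compartment_id"], [])
stand = Entity("stand", ["stand_id"], ["from_year"])
soil = Entity("stand_soil", ["inventory_year"], ["soil_type"], owner=stand)
species = Entity("tree_species", ["species_code"], ["latin_name"])

schema = map_to_relations(
    [farm, compartment, stand, soil, species],
    [
        Association("has_compartment", farm, compartment, "1:N"),
        Association("has_stand", compartment, stand, "1:N"),
        Association("grows", stand, species, "N:M"),
    ],
)
```

In this example the two 1:N associations produce no relations of their own and only add a foreign key column to the "N" side, whereas the N:M association becomes the separate relation grows.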

Geodatabase with CASE tool and UML 

With the release of ArcGIS, ESRI introduced a new object-relational 
model called the geodatabase. This model is implemented 
as an extension to the standard relational model, integrating 
it with object-oriented concepts in a manner that allows geographic 
objects to be modelled with their behaviors. The geodatabase 
supports many object-oriented concepts, such as inheritance, 
encapsulation and polymorphism, giving database developers 
the ability to build more complex behaviors into objects. ArcGIS 
automatically provides the object-relational mapping and manages 
the integrity of the data within the database. There are three 
general strategies for creating a geodatabase, namely, migrating 
existing databases to a geodatabase, using tools in ArcCatalog or 
ArcToolbox to create a geodatabase, and using UML (Unified Modeling 
Language) and CASE (Computer-Aided Software Engineering) 
tools to build a geodatabase. Applying CASE tools to geodatabase 
design has become increasingly popular recently. The general 
strategy for using UML and a CASE tool to design and create 
a geodatabase involves three steps. First, the geodatabase is 
designed as a UML model (see Fig. 2). Then the model is exported to an 
XMI (XML Metadata Interchange) file or to Microsoft Repository. Finally, 
the Schema Wizard in ArcCatalog (CASE tool) is applied to 
create the data schema (see Fig. 3). 



Fig. 1 Entity-relation diagram of database 


Conclusions and discussions 

As Bilahe Forestry Bureau shifted its mission from logging-oriented 
to ecology-oriented, it became more desirable to keep all 
historical inventory data in an integrated temporal database to 
facilitate data retrieval. The methodology of temporal databases 
(TDB) and the technology of information system re-engineering 
were applied to fulfil this task. During the process of 
re-engineering, a conceptual data model was rebuilt and 
expressed as an entity-relation diagram. The tuple time-stamping 
strategy was adopted to represent time in the database, and weak 
entities were defined to maintain temporal dependency. Furthermore, 
the data dictionary (ER diagram, database schema, etc.) 
produced in the process of reengineering makes operation and 
maintenance of the database easier. 

In order to express historical inventory information spatially, an 
object-relational geodatabase was also built using a CASE 
tool and UML. The most important characteristic of the reengineered 
database is that it is a “dynamic” database: it includes 
a time dimension and can easily express changes 
in entities, associations and attributes. It 
keeps all the historical data in one integrated database, so that it is 
easier to maintain the database’s integrity and to answer time-related 
questions by SQL. 
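
Two typical time-related questions can be sketched in SQL, run here against an in-memory SQLite database from Python. The tables, column names and values are hypothetical stand-ins for the reengineered schema, and 9999 is an assumed sentinel for an association that is still valid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE belong (
    stand_id INTEGER, compartment_id INTEGER,
    from_year INTEGER, until_year INTEGER,
    PRIMARY KEY (stand_id, from_year)
);
CREATE TABLE stand_inventory (
    stand_id INTEGER, inventory_year INTEGER, volume_m3 REAL,
    PRIMARY KEY (stand_id, inventory_year)
);
-- Hypothetical history: stand 1 moved to another compartment in 2000.
INSERT INTO belong VALUES (1, 10, 1980, 1999), (1, 20, 2000, 9999);
INSERT INTO stand_inventory VALUES
    (1, 1985, 120.0), (1, 1995, 95.0), (1, 2005, 110.0);
""")


def compartment_in(stand_id, year):
    """Interval question: which compartment did the stand belong to in `year`?"""
    row = conn.execute(
        "SELECT compartment_id FROM belong "
        "WHERE stand_id = ? AND from_year <= ? AND ? <= until_year",
        (stand_id, year, year),
    ).fetchone()
    return row[0] if row else None


def volume_as_of(stand_id, year):
    """Point question: what was the most recent inventoried volume by `year`?"""
    row = conn.execute(
        "SELECT volume_m3 FROM stand_inventory "
        "WHERE stand_id = ? AND inventory_year <= ? "
        "ORDER BY inventory_year DESC LIMIT 1",
        (stand_id, year),
    ).fetchone()
    return row[0] if row else None
```

Both questions are answered from a single database with plain SQL, without opening a separate spreadsheet for each year.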

The reengineered database is a relational database and can be 
implemented in any database management system (DBMS) that 
follows the relational model. It should be recognized 
that a well-designed database is only the first step towards a 
well-operated information system. In order to manage forest resources 
properly, a well-tailored information system also needs to be developed. 
How to develop a forest stand information system is outside the 
scope of this study, but it is equally crucial for forest resources 
management in Bilahe Forestry Bureau. 
Fig. 3 The steps of creating a geodatabase with the help of UML and 
a CASE tool 